Audio-visual Convolutive Blind Source Separation

Authors

  • Qingju Liu
  • Wenwu Wang
  • Philip Jackson
Abstract

We present a novel method for separating speech from audio mixtures using audio-visual coherence. The method has two stages: in the off-line training stage, a Gaussian mixture model (GMM) statistically characterises the audio-visual coherence using features obtained from the training set; in the separation stage, likelihood maximization is performed on the spectral components separated by independent component analysis (ICA). To address the permutation and scaling indeterminacies of frequency-domain blind source separation (BSS), we propose a new sorting and rescaling scheme based on this bimodal coherence. We tested the algorithm on the XM2VTS database; the results show that it resolves the permutation problem with high accuracy and effectively mitigates the scaling problem.
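
The sorting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a diagonal-covariance GMM over joint audio-visual feature vectors whose parameters are already trained, and it picks the assignment of separated components to speakers by exhaustive search over permutations. The function names, feature shapes, and the single-band simplification (no per-frequency-bin handling, no rescaling) are all hypothetical.

```python
import itertools
import numpy as np


def gmm_log_likelihood(x, weights, means, variances):
    """Total log-likelihood of the rows of x under a diagonal-covariance GMM.

    x: (T, D) feature frames; weights: (K,); means, variances: (K, D).
    """
    diff = x[:, None, :] - means[None, :, :]               # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)              # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)  # normalisation
    )                                                      # (T, K)
    log_weighted = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.sum())


def best_permutation(audio_feats, visual_feats, gmm):
    """Assign separated audio components to speakers by likelihood.

    audio_feats: list of (T, Da) arrays, one per separated component.
    visual_feats: list of (T, Dv) arrays, one per speaker.
    gmm: tuple (weights, means, variances) over joint (Da + Dv)-dim vectors.
    Returns a tuple perm where perm[i] is the audio component for speaker i.
    """
    best, best_score = None, -np.inf
    candidates = itertools.permutations(range(len(audio_feats)),
                                        len(visual_feats))
    for perm in candidates:
        score = sum(
            gmm_log_likelihood(
                np.hstack([audio_feats[p], visual_feats[i]]), *gmm)
            for i, p in enumerate(perm)
        )
        if score > best_score:
            best, best_score = perm, score
    return best
```

In a frequency-domain setting, a search like this would presumably be repeated for each frequency bin or band, with audio features derived from the ICA outputs in that band. Exhaustive search is only practical for a small number of sources.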

Similar resources

Using the Bi-modality of Speech for Convolutive Frequency Domain Blind Speech Separation

The problem of blind source separation for the case of convolutive mixtures of speech is considered. A novel algorithm is proposed that exploits the bi-modality of speech. This is achieved by incorporating joint audio-visual features into an existing BSS algorithm for the purpose of improving the convergence rate of the source separation algorithm. The increase in the rate of convergence when u...

A Survey of Convolutive Blind Source Separation Methods

In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy, wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks.

Blind Source Separation of Convolutive Audio Using an Adaptive Stereo Basis

We consider the problem of convolutive blind source separation of audio mixtures. We propose an Adaptive Stereo Basis (ASB) method based on learning a set of basis vector pairs from the time-domain stereo mixtures. The basis vector pairs are clustered using estimated directions of arrival (DOAs) such that each basis vector pair is associated with one source. The ASB method is compared with the...

Audio source separation of convolutive mixtures

The problem of separating audio sources recorded in a real-world situation is well established in the modern literature. One method for solving this problem is Blind Source Separation (BSS) using Independent Component Analysis (ICA). The recording environment is usually modeled as convolutive. Previous research on ICA of instantaneous mixtures provided a solid background for the separation of convolved...

Undetermined Convolutive Blind Source Separation

This paper presents a blind source separation process for convolutive mixtures of audio sources. The underdetermined condition, in which there are fewer microphones than sources, is taken as the mixing model. Separation is performed in the frequency domain by an expectation–maximization (EM) algorithm. T-F masking is used, which is a powerful approach for the separation ...

Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain

This paper gives an overview of a total solution for frequency-domain blind source separation (BSS) of convolutive mixtures of audio signals, especially speech. Frequency-domain BSS performs independent component analysis (ICA) in each frequency bin, and this is more efficient than time-domain BSS. We describe a sophisticated total solution for frequency-domain BSS, including permutation, scaling, circular...

Publication date: 2010